A versatile computational pipeline for bacterial genome annotation improvement and comparative analysis, with Brucella as a use case

Authors

  • G.X. Yu
  • E.E. Snyder
  • S.M. Boyle
  • O.R. Crasta
  • M. Czar
  • S.P. Mane
  • A. Purkayastha
  • B. Sobral
  • J.C. Setubal
Abstract

We present a bacterial genome computational analysis pipeline, called GenVar. The pipeline, based on the program GeneWise, is designed to analyze an annotated genome and automatically identify missed gene calls and sequence variants such as genes with disrupted reading frames (split genes) and those with insertions and deletions (indels). For a given genome to be analyzed, GenVar relies on a database containing closely related genomes (such as other species or strains) as well as a few additional reference genomes. GenVar also helps identify gene disruptions probably caused by sequencing errors. We exemplify GenVar's capabilities by presenting results from the analysis of four Brucella genomes. Brucella is an important human pathogen and zoonotic agent. The analysis revealed hundreds of missed gene calls, new split genes and indels, several of which are species specific and hence provide valuable clues to the understanding of the genome basis of Brucella pathogenicity and host specificity.
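The three categories of findings named in the abstract (missed gene calls, split genes, indels) can be illustrated with a minimal sketch. This is not GenVar's implementation: the `Hit` record, its fields, and the decision rules below are hypothetical, and assume that a protein-to-genome aligner has already been run and its output summarized per reference protein.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Hit:
    """Hypothetical summary of one reference protein aligned to the target genome."""
    start: int                # first aligned genome position (inclusive)
    end: int                  # last aligned genome position (inclusive)
    frameshifts: int          # reading-frame changes reported within the alignment
    internal_stops: int       # in-frame stop codons inside the aligned region
    gap_lengths: List[int] = field(default_factory=list)  # alignment gap sizes, in nucleotides


def overlaps(hit: Hit, genes: List[Tuple[int, int]]) -> bool:
    """Return True if the hit overlaps any annotated gene interval (start, end)."""
    return any(hit.start <= g_end and g_start <= hit.end for g_start, g_end in genes)


def classify(hit: Hit, annotated_genes: List[Tuple[int, int]]) -> List[str]:
    """Assign illustrative labels to a hit; the rules are assumptions, not GenVar's."""
    labels = []
    # A well-supported alignment with no annotated gene at that locus.
    if not overlaps(hit, annotated_genes):
        labels.append("missed gene call")
    # A frameshift or premature stop disrupts the reading frame.
    if hit.frameshifts > 0 or hit.internal_stops > 0:
        labels.append("split gene")
    # A gap whose length is a multiple of three keeps the reading frame intact.
    if any(n > 0 and n % 3 == 0 for n in hit.gap_lengths):
        labels.append("in-frame indel")
    return labels or ["consistent"]


if __name__ == "__main__":
    genes = [(100, 400), (900, 1500)]
    print(classify(Hit(500, 800, 0, 0), genes))
    print(classify(Hit(120, 380, 1, 0), genes))
```

In this sketch a single hit can carry more than one label, for example a locus that is both unannotated and frame-disrupted. Deciding whether a disruption reflects a real mutation or a sequencing error, which the abstract says GenVar helps with, is outside what this example attempts.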

Similar resources

Comparative bioinformatics analysis of a wild diploid Gossypium with two cultivated allotetraploid species

Background: Gossypium thurberi is a wild diploid species that has been used to improve cultivated allotetraploid cotton. G. thurberi belongs to D genome, which is an important wild bio-source for the cotton breeding and genetic research. To a certain degree, chloroplast DNA sequence information are a versatile tool for species identification and phylogenetic implications in plants. Different ch...

PIPE-chipSAD: A Pipeline for the Analysis of High Density Arrays of Bacterial Transcriptomes

PIPE-chipSAD is a pipeline for bacterial transcriptome studies based on high-density microarray experiments. The main algorithm chipSAD, integrates the analysis of the hybridization signal with the genomic position of probes and identifies portions of the genome transcribing for mRNAs. The pipeline includes a procedure, align-chipSAD, to build a multiple alignment of transcripts originating in ...

A Comparative Evaluation of ELISA, PCR, and Serum Agglutination Tests For Diagnosis of Brucella Using Human Serum

Background & Objective: Since the symptoms of Brucellosis are often atypical and nonspecific, using clinical signs alone to diagnose brucellosis is not advised; therefore, the diagnosis relies predominantly on laboratory testing. Currently, molecular, serological, and microbiological methods are used for diagnosis of this disease. In this study we examined ELI...

ASPIC: a web resource for alternative splicing prediction and transcript isoforms characterization

Alternative splicing (AS) is now emerging as a major mechanism contributing to the expansion of the transcriptome and proteome complexity of multicellular organisms. The fact that a single gene locus may give rise to multiple mRNAs and protein isoforms, showing both major and subtle structural variations, is an exceptionally versatile tool in the optimization of the coding capacity of the eukar...

MicroScope: a platform for microbial genome annotation and comparative genomics

The initial outcome of genome sequencing is the creation of long text strings written in a four letter alphabet. The role of in silico sequence analysis is to assist biologists in the act of associating biological knowledge with these sequences, allowing investigators to make inferences and predictions that can be tested experimentally. A wide variety of software is available to the scientific ...

Volume 35

Publication date: 2007